Goto

Collaborating Authors

 neural network ensemble




Calibrated and uncertain? Evaluating uncertainty estimates in binary classification models

arXiv.org Machine Learning

Rigorous statistical methods, including parameter estimation with accompanying uncertainties, underpin the validity of scientific discovery, especially in the natural sciences. With increasingly complex data models such as deep learning techniques, uncertainty quantification has become exceedingly difficult and a plethora of techniques have been proposed. In this case study, we use the unifying framework of approximate Bayesian inference combined with empirical tests on carefully created synthetic classification datasets to investigate qualitative properties of six different probabilistic machine learning algorithms for class probability and uncertainty estimation: (i) a neural network ensemble, (ii) neural network ensemble with conflictual loss, (iii) evidential deep learning, (iv) a single neural network with Monte Carlo Dropout, (v) Gaussian process classification and (vi) a Dirichlet process mixture model. We check if the algorithms produce uncertainty estimates which reflect commonly desired properties, such as being well calibrated and exhibiting an increase in uncertainty for out-of-distribution data points. Our results indicate that all algorithms are well calibrated, but none of the deep learning based algorithms provide uncertainties that consistently reflect lack of experimental evidence for out-of-distribution data points. We hope our study may serve as a clarifying example for researchers developing new methods of uncertainty estimation for scientific data-driven modeling.


Out-of-Distribution Runtime Adaptation with Conformalized Neural Network Ensembles

arXiv.org Artificial Intelligence

We present a method to integrate real-time out-of-distribution (OOD) detection for neural network trajectory predictors, and to adapt the control strategy of a robot (e.g., a self-driving car or drone) to preserve safety while operating in OOD regimes. Specifically, we use a neural network ensemble to predict the trajectory for a dynamic obstacle (such as a pedestrian), and use the maximum singular value of the empirical covariance among the ensemble as a signal for OOD detection. We calibrate this signal with a small fraction of held-out training data using the methodology of conformal prediction, to derive an OOD detector with probabilistic guarantees on the false-positive rate of the detector, given a user-specified confidence level. During in-distribution operation, we use an MPC controller to avoid collisions with the obstacle based on the trajectory predicted by the neural network ensemble. When OOD conditions are detected, we switch to a reachability-based controller to guarantee safety under the worst-case actions of the obstacle. We verify our method in extensive autonomous driving simulations in a pedestrian crossing scenario, showing that our OOD detector obtains the desired accuracy rate within a theoretically-predicted range. We also demonstrate the effectiveness of our method with real pedestrian data. We show improved safety and less conservatism in comparison with two state-of-the-art methods that also use conformal prediction, but without OOD adaptation.


Accuracy of TextFooler black box adversarial attacks on 01 loss sign activation neural network ensemble

arXiv.org Artificial Intelligence

Recent work has shown the defense of 01 loss sign activation neural networks against image classification adversarial attacks. A public challenge to attack the models on CIFAR10 dataset remains undefeated. We ask the following question in this study: are 01 loss sign activation neural networks hard to deceive with a popular black box text adversarial attack program called TextFooler? We study this question on four popular text classification datasets: IMDB reviews, Yelp reviews, MR sentiment classification, and AG news classification. We find that our 01 loss sign activation network is much harder to attack with TextFooler compared to sigmoid activation cross entropy and binary neural networks. We also study a 01 loss sign activation convolutional neural network with a novel global pooling step specific to sign activation networks. With this new variation we see a significant gain in adversarial accuracy rendering TextFooler practically useless against it. We make our code freely available at \url{https://github.com/zero-one-loss/wordcnn01} and \url{https://github.com/xyzacademic/mlp01example}. Our work here suggests that 01 loss sign activation networks could be further developed to create fool proof models against text adversarial attacks.


Sensitivity-Aware Amortized Bayesian Inference

arXiv.org Machine Learning

Bayesian inference is a powerful framework for making probabilistic inferences and decisions under uncertainty. Fundamental choices in modern Bayesian workflows concern the specification of the likelihood function and prior distributions, the posterior approximator, and the data. Each choice can significantly influence model-based inference and subsequent decisions, thereby necessitating sensitivity analysis. In this work, we propose a multifaceted approach to integrate sensitivity analyses into amortized Bayesian inference (ABI, i.e., simulation-based inference with neural networks). First, we utilize weight sharing to encode the structural similarities between alternative likelihood and prior specifications in the training process with minimal computational overhead. Second, we leverage the rapid inference of neural networks to assess sensitivity to various data perturbations or pre-processing procedures. In contrast to most other Bayesian approaches, both steps circumvent the costly bottleneck of refitting the model(s) for each choice of likelihood, prior, or dataset. Finally, we propose to use neural network ensembles to evaluate variation in results induced by unreliable approximation on unseen data. We demonstrate the effectiveness of our method in applied modeling problems, ranging from the estimation of disease outbreak dynamics and global warming thresholds to the comparison of human decision-making models. Our experiments showcase how our approach enables practitioners to effectively unveil hidden relationships between modeling choices and inferential conclusions.


Minibatch training of neural network ensembles via trajectory sampling

arXiv.org Artificial Intelligence

Most iterative neural network training methods use estimates of the loss function over small random subsets (or minibatches) of the data to update the parameters, which aid in decoupling the training time from the (often very large) size of the training datasets. Here, we show that a minibatch approach can also be used to train neural network ensembles (NNEs) via trajectory methods in a highly efficient manner. We illustrate this approach by training NNEs to classify images in the MNIST datasets. This method gives an improvement to the training times, allowing it to scale as the ratio of the size of the dataset to that of the average minibatch size which, in the case of MNIST, gives a computational improvement typically of two orders of magnitude. We highlight the advantage of using longer trajectories to represent NNEs, both for improved accuracy in inference and reduced update cost in terms of the samples needed in minibatch updates.


Probabilistic Solar Proxy Forecasting with Neural Network Ensembles

arXiv.org Artificial Intelligence

Space weather indices are used commonly to drive forecasts of thermosphere density, which directly affects objects in low-Earth orbit (LEO) through atmospheric drag. One of the most commonly used space weather proxies, $F_{10.7 cm}$, correlates well with solar extreme ultra-violet (EUV) energy deposition into the thermosphere. Currently, the USAF contracts Space Environment Technologies (SET), which uses a linear algorithm to forecast $F_{10.7 cm}$. In this work, we introduce methods using neural network ensembles with multi-layer perceptrons (MLPs) and long-short term memory (LSTMs) to improve on the SET predictions. We make predictions only from historical $F_{10.7 cm}$ values, but also investigate data manipulation to improve forecasting. We investigate data manipulation methods (backwards averaging and lookback) as well as multi step and dynamic forecasting. This work shows an improvement over the baseline when using ensemble methods. The best models found in this work are ensemble approaches using multi step or a combination of multi step and dynamic predictions. Nearly all approaches offer an improvement, with the best models improving between 45 and 55\% on relative MSE. Other relative error metrics were shown to improve greatly when ensembles methods were used. We were also able to leverage the ensemble approach to provide a distribution of predicted values; allowing an investigation into forecast uncertainty. Our work found models that produced less biased predictions at elevated and high solar activity levels. Uncertainty was also investigated through the use of a calibration error score metric (CES), our best ensemble reached similar CES as other work.


Training neural network ensembles via trajectory sampling

arXiv.org Artificial Intelligence

In machine learning, there is renewed interest in neural network ensembles (NNEs), whereby predictions are obtained as an aggregate from a diverse set of smaller models, rather than from a single larger model. Here, we show how to define and train a NNE using techniques from the study of rare trajectories in stochastic systems. We define an NNE in terms of the trajectory of the model parameters under a simple, and discrete in time, diffusive dynamics, and train the NNE by biasing these trajectories towards a small time-integrated loss, as controlled by appropriate counting fields which act as hyperparameters. We demonstrate the viability of this technique on a range of simple supervised learning tasks. We discuss potential advantages of our trajectory sampling approach compared with more conventional gradient based methods.


Neural Network Ensembles, Cross Validation, and Active Learning

Neural Information Processing Systems

Learning of continuous valued functions using neural network en(cid:173) sembles (committees) can give improved accuracy, reliable estima(cid:173) tion of the generalization error, and active learning. The ambiguity is defined as the variation of the output of ensemble members aver(cid:173) aged over unlabeled data, so it quantifies the disagreement among the networks. It is discussed how to use the ambiguity in combina(cid:173) tion with cross-validation to give a reliable estimate of the ensemble generalization error, and how this type of ensemble cross-validation can sometimes improve performance. It is shown how to estimate the optimal weights of the ensemble members using unlabeled data. By a generalization of query by committee, it is finally shown how the ambiguity can be used to select new training data to be labeled in an active learning scheme.